NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Gauguin, Descartes, Bayes: A Diurnal Golem’s Brain

https://doi.org/10.1145/3759429.3762631

Chandra, Kartik; Liu, Amanda; Ragan-Kelley, Jonathan; Tenenbaum, Joshua B (October 2025, ACM)

A "quine" is a deterministic program that prints itself. In this essay, I will show you a "gauguine": a probabilistic program that infers itself. A gauguine is repeatedly asked to guess its own source code. Initially, its chances of guessing correctly are of course minuscule. But as the gauguine observes more and more of its own previous guesses, it detects patterns of behavior and gains information about its inner workings. This information allows it to bootstrap self-knowledge, and ultimately discover its own source code. We will discuss how—and why—we might write a gauguine, and what we stand to learn by constructing one.
more » « less
Free, publicly-accessible full text available October 9, 2026
Inferring the Future by Imagining the Past

Chandra, Kartik; Chen, Tony; Li, Tzu-Mao; Ragan-Kelley, Jonathan; Tenenbaum, Joshua (December 2023, Proceedings of 2023 Conference on Neural Information Processing Systems)

A single panel of a comic book can say a lot: it can depict not only where the characters currently are, but also their motions, their motivations, their emotions, and what they might do next. More generally, humans routinely infer complex sequences of past and future events from a static snapshot of a dynamic scene, even in situations they have never seen before. In this paper, we model how humans make such rapid and flexible inferences. Building on a long line of work in cognitive science, we offer a Monte Carlo algorithm whose inferences correlate well with human intuitions in a wide variety of domains, while only using a small, cognitively-plausible number of samples. Our key technical insight is a surprising connection between our inference problem and Monte Carlo path tracing, which allows us to apply decades of ideas from the computer graphics community to this seemingly-unrelated theory of mind task.
more » « less
Full Text Available
How to Guess a Gradient

Singhal, Utkarsh; Cheung, Brian; Chandra, Kartik; Ragan-Kelley, Jonathan; Tenenbaum, Joshua B; Poggio, Tomaso; Yu, Stella X (December 2023, arXivorg)

How much can you say about the gradient of a neural network without computing a loss or knowing the label? This may sound like a strange question: surely the answer is “very little.” However, in this paper, we show that gradients are more structured than previously thought. Gradients lie in a predictable low-dimensional subspace which depends on the network architecture and incoming features. Exploiting this structure can significantly improve gradient-free optimization schemes based on directional derivatives, which have struggled to scale beyond small networks trained on toy datasets. We study how to narrow the gap in optimization performance between methods that calculate exact gradients and those that use directional derivatives. Furthermore, we highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
more » « less
Full Text Available
Acting as Inverse Inverse Planning

https://doi.org/10.1145/3588432.3591510

Chandra, Kartik; Li, Tzu-Mao; Tenenbaum, Joshua; Ragan-Kelley, Jonathan (July 2023, ACM)

Great storytellers know how to take us on a journey. They direct characters to act—not necessarily in the most rational way—but rather in a way that leads to interesting situations, and ultimately creates an impactful experience for audience members looking on. If audience experience is what matters most, then can we help artists and animators directly craft such experiences, independent of the concrete character actions needed to evoke those experiences? In this paper, we offer a novel computational framework for such tools. Our key idea is to optimize animations with respect to simulated audience members’ experiences. To simulate the audience, we borrow an established principle from cognitive science: that human social intuition can be modeled as “inverse planning,” the task of inferring an agent’s (hidden) goals from its (observed) actions. Building on this model, we treat storytelling as “inverse inverse planning,” the task of choosing actions to manipulate an inverse planner’s inferences. Our framework is grounded in literary theory, naturally capturing many storytelling elements from first principles. We give a series of examples to demonstrate this, with supporting evidence from human subject studies.
more » « less
Gradient Descent: The Ultimate Optimizer

Chandra, Kartik; Xie, Audrey; Ragan-Kelley, Jonathan; Meijer, Erik (December 2022, Advances in neural information processing systems)

Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as its step size. Recent work has shown how the step size can itself be optimized alongside the model parameters by manually deriving expressions for "hypergradients" ahead of time. We show how to automatically compute hypergradients with a simple and elegant modification to backpropagation. This allows us to easily apply the method to other optimizers and hyperparameters (eg momentum coefficients). We can even recursively apply the method to its own hyper-hyperparameters, and so on ad infinitum. As these towers of optimizers grow taller, they become less sensitive to the initial choice of hyperparameters. We present experiments validating this for MLPs, CNNs, and RNNs. Finally, we provide a simple PyTorch implementation of this algorithm (see http://people. csail. mit. edu/kach/gradient-descent-the-ultimate-optimizer).
more » « less
Designing Perceptual Puzzles by Differentiating Probabilistic Programs

https://doi.org/10.1145/3528233.3530715

Chandra, Kartik; Li, Tzu-Mao; Tenenbaum, Joshua; Ragan-Kelley, Jonathan (August 2022, ACM SIGGRAPH Conference Proceedings)

Full Text Available
Beyond Laurel/Yanny: An Autoencoder-Enabled Search for Polyperceivable Audio

Chandra, Kartik; Kabaghe, Chuma; Valiant, Gregory (January 2021, ACL-IJCNLP)
null (Ed.)
The famous “laurel/yanny” phenomenon references an audio clip that elicits dramatically different responses from different listeners. For the original clip, roughly half the population hears the word “laurel,” while the other half hears “yanny.” How common are such ``polyperceivable'' audio clips? In this paper we apply ML techniques to study the prevalence of polyperceivability in spoken language. We devise a metric that correlates with polyperceivability of audio clips, use it to efficiently find new “laurel/yanny”-type examples, and validate these results with human experiments. Our results suggest that polyperceivable examples are surprisingly prevalent, existing for >2% of English words.
more » « less
Full Text Available

Search for: All records